Knowledge Utilization in Handwritten Zip Code Recognition
Abstract
The process of recognizing the postal codes or zip codes in a handwritten address can be aided by many sources of external knowledge. City and state names are obvious examples that can be used in conjunction with a city-state-zip directory to provide evidence about digits in a zip code. This paper describes an extension of this methodology that uses knowledge about legal street names and suffixes to constrain the digits in a zip code. The technique does not require complete recognition of all characters in words. Rather, a feature description of words is used to index a set of possible zip codes. Some preliminary experiments with the ZIP+4 database are discussed. It is shown that even a relatively simple description of two words in the street line of an address can significantly reduce the number of zip codes that could appear on a piece of mail.
1. Introduction
Reading the zip code within the destination address area of a mailpiece is of central importance in automated mail sorting [6]. The problem is not always amenable to straightforward alphanumeric character recognition, particularly when the address is handwritten. Fortunately, the numeric zip code in a destination address does not usually occur in isolation. There is additional or redundant information in the form of a city name, a state name, a street address, the name of a destination, and perhaps an attention line. Also, there is usually a return address and sometimes advertising material. All this knowledge has some potential to contribute to recognition accuracy.
Previous Artificial Intelligence approaches to the general reading problem have focused on the use of intra-word knowledge [2,5]. A robust methodology is needed to focus the various sources of inter-word knowledge on the reading problem [3,4].
The most obvious use of inter-word or external knowledge in handwritten zip code recognition is for the city and state to confirm a five-digit zip code.
State information constrains the first digit, whereas city information constrains the second and possibly the third digits. Thus, if the city and state information is known, the number of alternatives is reduced for zip code recognition. Multiline reading equipment exists today that is capable of using city and state information as well as street information in determining a nine-digit zip code.
Besides the knowledge in the city/state/zip line (or lines), there is a wealth of knowledge in the street line that could be utilized. Several sources of external knowledge that could be used to constrain the digits in a zip code are illustrated in Figure 1. For example, if mail for a particular city is being sorted, recognizing the pre-directional code as N might limit the zip code to one that occurs north of some known boundary. A better use of external knowledge is to recognize the post-directional code (e.g., NW, NE, etc.) when sorting mail for cities that use such codes (e.g., Washington, D.C.). This could significantly reduce the number of zip codes that could match a mail piece. Recognizing the street name or the organization would have a similar effect. If combined with a suitably arranged dictionary, this would in many instances specify the zip code.
The focus of this research is on a technique of using external knowledge to constrain the digits in a five-digit zip code. Usually, accurate recognition of all the characters in a word, such as the city name, is required to use external knowledge. There is usually little tolerance for broken, touching, or smeared characters. Such a recognition procedure would be appropriate for clean, well-printed text, in which case there may be no need to look beyond the zip code. What is desired instead is a technique that is robust in the presence of noise and can extract some useful information from textual portions of a destination address.
Such a technique should be able to constrain the digits of a postal code even if it cannot fully recognize the text.
The technique proposed here computes a noise-tolerant feature description of a specific word or words in an address. This feature description is then used to access a dictionary and return a number of zip codes that correspond to words with that feature description. This either produces a unique identification of the zip code or, by constraining the digits of the zip code, provides information that could be used to advantage in recognition.
This methodology does not require exact recognition of all the characters in the words it examines. Only some features have to be calculated, and these features may provide only a gross description of the word. However, such a feature description may provide useful knowledge even though it is tolerant to noise and easy to compute.
Figure 1. Template for the face of a mailpiece.
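The feature-indexed lookup described in this section can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the excerpt does not say which word features the paper uses, so `word_features` substitutes a hypothetical gross description (word length, number of ascenders, number of descenders) and computes it from a character string as a stand-in for features measured on a word image. The directory entries and their zip codes are invented toy data, not taken from the ZIP+4 database, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical shape classes for lowercase letters (an assumption, not the paper's features).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def word_features(word):
    """Gross word description: (length, ascender count, descender count)."""
    w = word.lower()
    return (
        len(w),
        sum(c in ASCENDERS for c in w),
        sum(c in DESCENDERS for c in w),
    )


def build_index(directory):
    """Map (street-name features, suffix features) to the set of matching zip codes."""
    index = defaultdict(set)
    for street, suffix, zip_code in directory:
        index[(word_features(street), word_features(suffix))].add(zip_code)
    return index


def lookup(index, street, suffix):
    """Return every zip code whose street line shares this feature description."""
    key = (word_features(street), word_features(suffix))
    return index.get(key, set())


def digit_constraints(zip_codes):
    """For each of the five digit positions, the digits that remain possible."""
    return [{z[i] for z in zip_codes} for i in range(5)]


# Invented toy directory: (street name, suffix, five-digit zip code).
TOY_DIRECTORY = [
    ("Maple", "Street", "14201"),
    ("Maple", "Avenue", "14226"),
    ("Grant", "Avenue", "14213"),
    ("Elmwood", "Avenue", "14222"),
    ("Highland", "Road", "14223"),
    ("Bailey", "Avenue", "14215"),
]

index = build_index(TOY_DIRECTORY)

# "Maple" and "Grant" share the same gross description, so the lookup is ambiguous,
# yet the candidate set still fixes the first three digits.
candidates = lookup(index, "Maple", "Avenue")
print(sorted(candidates))  # ['14213', '14226']
print(digit_constraints(candidates))
```

In this toy case the two street names collide on the same feature description, so the zip code is not uniquely identified, but the surviving candidates agree on the first three digits. That is the second outcome the text describes: the lookup constrains the digits even when it does not pin down a single code.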
Similar resources
Applying Domain Knowledge to the Recognition of Handwritten Zip Codes
We present a simple system that exploits domain knowledge to improve the segmentation and recognition of handwritten ZIP codes. Specifically, we show that the concept of metaclasses of digits, introduced by Morita et al. [16] for recognition of Brazilian bank check dates, can be extended to ZIP code recognition. We also show that, when this domain knowledge is present, integrated segmentation a...
A novel free format Persian/Arabic handwritten zip code recognition system
In Iran, like many other countries, the categorization of postal envelopes is executed manually, mostly based on the handwritten addresses and zip codes. That process is still slow and prone to man-made errors. Therefore, having an automated, accurate and efficient system...
Human Performance in Recognition of Handwritten ZIP Codes from the CEDAR Database
We established human performance in recognition of handwritten ZIP codes taken from the standard CEDAR database. We expect that the result will serve as a benchmark for machine performance in recognition of handwritten ZIP codes.
Handwritten ZIP code recognition using lexicon free word recognition algorithm
This paper describes a new approach to ZIP code recognition using a word recognition algorithm, where a numeral string is recognized as a word. This paper also describes an end to end ZIP code recognition system consisting of tilt/slant correction, line segmentation, word segmentation, ZIP code location, as well as the ZIP code recognition. Evaluation tests are performed using address block ima...
A blackboard-based approach to handwritten ZIP code recognition
A methodology for recognizing ZIP codes (postal codes) in handwritten addresses is presented. The method uses many diverse pattern recognition and image processing algorithms. Given a high-resolution image of a hand-written address block, the solution invokes routines capable of hypothesizing the location of the ZIP Code, segmenting and recognizing ZIP Code digits, locating and recognizing City ...
Publication date: 1987